The importance of cross-lingual information for matching Wikipedia with the Cyc ontology
Abstract
In this paper we investigate how cross-lingual evidence can improve matching between different classification schemas. We concentrate specifically on the task of mapping between Wikipedia categories and Cyc terms, as well as the classification of Wikipedia articles into the Cyc taxonomy, and show how this process can be improved by using the evidence available in different language editions of Wikipedia. The results show that the performance of the mapping procedure can be improved by 0.6 to 4.9 percentage points, depending on the number of external Wikipedia editions and the given task.
Similar resources
Integrating Cyc and Wikipedia: Folksonomy Meets Rigorously Defined Common-Sense
Integration of ontologies begins with establishing mappings between their concept entries. We map categories from the largest manually-built ontology, Cyc, onto Wikipedia articles describing corresponding concepts. Our method draws both on Wikipedia’s rich but chaotic hyperlink structure and Cyc’s carefully defined taxonomic and common-sense knowledge. On 9,333 manual alignments by one person, ...
Ontological quality control in large-scale, applied ontology matching
To date, large-scale applied ontology mapping has relied greatly on label matching and other relatively simple syntactic features. In search of more holistic and accurate alignment, we offer a suite of partially overlapping ontology mapping heuristics which allows us to hypothesise matches and test them against the knowledge in our source ontology (OpenCyc). We thereby automatically align our s...
Monolingual and cross-lingual ontology matching with CIDER-CL: evaluation report for OAEI 2013
CIDER-CL is the evolution of CIDER, a schema-based ontology alignment system. Its algorithm compares each pair of ontology entities by analysing their similarity at different levels of their ontological context (linguistic description, superterms, subterms, related terms, etc.). Then, such elementary similarities are combined by means of artificial neural networks. In its current version, CIDER...
Centralized Clustering Method To Increase Accuracy In Ontology Matching Systems
Ontologies are the main infrastructure of the Semantic Web, providing facilities for integration, searching and sharing of information on the web. The development of ontologies as the basis of the Semantic Web, together with their heterogeneity, has led to the need for ontology matching. With the emergence of large-scale ontologies in real domains, ontology matching systems face problems such as memory con...
Cross-Lingual Ontology Mapping - An Investigation of the Impact of Machine Translation
Ontologies are at the heart of knowledge management and make use of information that is not only written in English but also in many other natural languages. In order to enable knowledge discovery, sharing and reuse of these multilingual ontologies, it is necessary to support ontology mapping despite natural language barriers. This paper examines the soundness of a generic approach that involve...